A Review on Urdu Language Parsing
Authors
Abstract
Natural Language Processing (NLP) is a multidisciplinary area spanning Artificial Intelligence, Machine Learning, and Computational Linguistics that deals with the automatic understanding and processing of human language. The way in which we share content or feelings has always been of great importance in understanding and processing language. Parsing is the most suitable approach for identifying what the available sentences express. It is the process in which the syntactic structure of a sentence is identified using grammatical tags. A syntactically correct sentence structure is obtained by assigning grammatical labels to its constituents using a lexicon and syntactic rules. Phrase structure and dependency structure are the two main formalisms for parsing natural language sentences. The growing use of Web 2.0 has produced novel research challenges, as people from different geographical areas use this channel to share content in their native languages. Urdu is one such free word order language; it is widely shared on social media sites, but identifying and summarizing Urdu sentences is a challenging task. In this review paper we present an overview of recent work on parsing fixed word order languages (such as English) and free word order languages (such as Urdu) in order to identify the most suitable method for Urdu language parsing. The survey finds that dependency parsing is more appropriate for Urdu and other free word order languages, and that parsers built for English are not useful for parsing Urdu sentences because of the morphological, syntactic, and grammatical differences between the two languages.
Keywords: Natural Language Processing; Machine Learning; Urdu Language Processing; Dependency Parsing
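The abstract's claim that dependency structure suits free word order languages can be illustrated with a minimal sketch. It is not taken from the paper: the transliterated sentence, the hypothetical helper `arcs`, and the relation labels (borrowed from Universal Dependencies-style naming) are all assumptions chosen for illustration. The sketch encodes two orderings of the same sentence as head-indexed tokens and checks that both yield the same set of head-dependent relations.

```python
from typing import List, Set, Tuple

# One token: (word, index of its head in the sentence or -1 for the root, relation label).
# Labels follow Universal Dependencies-style naming; this is an assumption, not the paper's scheme.
Token = Tuple[str, int, str]


def arcs(sentence: List[Token]) -> Set[Tuple[str, str, str]]:
    """Return order-independent (head_word, relation, dependent_word) triples."""
    result = set()
    for word, head, rel in sentence:
        head_word = "ROOT" if head < 0 else sentence[head][0]
        result.add((head_word, rel, word))
    return result


# Illustrative transliteration: "Ali ne kitab parhi" (roughly, "Ali read the book").
sov = [("Ali", 3, "nsubj"), ("ne", 0, "case"), ("kitab", 3, "obj"), ("parhi", -1, "root")]

# The same words with the object moved to the front: "kitab Ali ne parhi".
osv = [("kitab", 3, "obj"), ("Ali", 3, "nsubj"), ("ne", 1, "case"), ("parhi", -1, "root")]

# The head indices differ between the two orders, but the relations between words do not.
same_relations = arcs(sov) == arcs(osv)
print(same_relations)
```

Because the relations are stated between words rather than between positions, reordering the words changes only the head indices and leaves the analysis itself intact. A phrase structure analysis, by contrast, would generally need a separate rule for each permitted ordering.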
Related resources
Morphologically rich Urdu grammar parsing using Earley algorithm
This work presents the development and evaluation of an extended Urdu parser. It further focuses on issues related to this parser and describes the changes made in the Earley algorithm to get accurate and relevant results from the Urdu parser. The parser makes use of a morphologically rich context free grammar extracted from a linguistically-rich Urdu treebank. This grammar with sufficient enco...
Exploiting Language Variants Via Grammar Parsing Having Morphologically Rich Information
In this paper, the development and evaluation of the Urdu parser are presented along with a comparison of existing resources for the language variants Urdu/Hindi. This parser was given a linguistically rich grammar extracted from a treebank. This context free grammar with sufficient encoded information is comparable with the state of the art parsing requirements for morphologically rich and cl...
Adjectival Phrases as the Sentiment Carriers in the Urdu Text
In this paper we present a comprehensive overview of the structures of the adjectival phrases in the Urdu language with respect to the task of sentiment analysis. Urdu is a widely spoken but one of the least explored languages by the computational linguistics community. After a detailed analysis of adjectival phrases in Urdu text we conclude that this language is orthographically, morphological...
Urdu in a parallel grammar development environment
Abstract. In this paper, we report on the role of the Urdu grammar in the Parallel Grammar (ParGram) project (Butt et al., 1999; Butt et al., 2002). The Urdu grammar was able to take advantage of standards in analyses set by the original grammars in order to speed development. However, novel constructions, such as correlatives and extensive complex predicates, resulted in expansions of the anal...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...